Xulong Zhang

Attention-weighted Centered Kernel Alignment for Knowledge Distillation in Large Audio-Language Models Applied to Speech Emotion Recognition

Feb 02, 2026
Via arXiv

CARE: Multi-Task Pretraining for Latent Continuous Action Representation in Robot Control

Jan 30, 2026
Via arXiv

MIRRORTALK: Forging Personalized Avatars Via Disentangled Style and Hierarchical Motion Control

Jan 30, 2026
Via arXiv

Head-Aware Visual Cropping: Enhancing Fine-Grained VQA with Attention-Guided Subimage

Jan 30, 2026
Via arXiv

CycleFlow: Leveraging Cycle Consistency in Flow Matching for Speaker Style Adaptation

Jan 03, 2025
Via arXiv

ESARM: 3D Emotional Speech-to-Animation via Reward Model from Automatically-Ranked Demonstrations

Nov 20, 2024
Via arXiv

Semi-Supervised Self-Learning Enhanced Music Emotion Recognition

Oct 29, 2024
Via arXiv

IDEAW: Robust Neural Audio Watermarking with Invertible Dual-Embedding

Sep 29, 2024
Via arXiv

Enhancing Emotion Recognition in Conversation through Emotional Cross-Modal Fusion and Inter-class Contrastive Learning

May 28, 2024
Via arXiv

RREH: Reconstruction Relations Embedded Hashing for Semi-Paired Cross-Modal Retrieval

May 28, 2024
Via arXiv